Improve Visual Question Answering Based On Text Feature Extraction
Similar resources
Speech-Based Visual Question Answering
This paper introduces the task of speech-based visual question answering (VQA), that is, to generate an answer given an image and an associated spoken question. Our work is the first study of speech-based VQA with the intention of providing insights for applications such as speech-based virtual assistants. Two methods are studied: an end-to-end deep neural network that directly uses audio wavef...
Trigger Extraction in Biological Text for Question Answering
When confronted with a large amount of text, it is difficult to manually extract important and relevant information from it. Having a way to extract that information is a highly interesting research problem that has the potential to improve question answering capabilities. Specifically, identifying sections of text that describe a term, phenomenon, or abstract idea and transforming them into st...
FVQA: Fact-based Visual Question Answering
Visual Question Answering (VQA) has attracted much attention in both the computer vision and natural language processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of su...
Learning Convolutional Text Representations for Visual Question Answering
Visual question answering is a recently proposed artificial intelligence task that requires a deep understanding of both images and texts. In deep learning, images are typically modeled through convolutional neural networks, and texts are typically modeled through recurrent neural networks. While the requirement for modeling images is similar to traditional computer vision tasks, such as object ...
Text-to-text generation for question answering
When answering questions, the major challenges are (a) to carefully determine the content of the answer and (b) to phrase it in a proper way. In IMIX, we focus on two text-to-text generation techniques to accomplish this: content selection and sentence fusion. Using content selection, we can extend answers to an arbitrary length, providing not just a direct answer but also related information so to be...
Journal
Journal title: Journal of Physics: Conference Series
Year: 2021
ISSN: 1742-6588,1742-6596
DOI: 10.1088/1742-6596/1856/1/012025